character representation
Analyzing Character Representation in Media Content using Multimodal Foundation Model: Effectiveness and Trust
Taka, Evdoxia, Bhattacharya, Debadyuti, Garde-Hansen, Joanne, Sharma, Sanjay, Guha, Tanaya
Recent advances in AI has made automated analysis of complex media content at scale possible while generating actionable insights regarding character representation along such dimensions as gender and age. Past works focused on quantifying representation from audio/video/text using AI models, but without having the audience in the loop. We ask, even if character distribution along demographic dimensions are available, how useful are those to the general public? Do they actually trust the numbers generated by AI models? Our work addresses these open questions by proposing a new AI-based character representation tool and performing a thorough user study. Our tool has two components: (i) An analytics extraction model based on the Contrastive Language Image Pretraining (CLIP) foundation model that analyzes visual screen data to quantify character representation across age and gender; (ii) A visualization component effectively designed for presenting the analytics to lay audience. The user study seeks empirical evidence on the usefulness and trustworthiness of the AI-generated results for carefully chosen movies presented in the form of our visualizations. We found that participants were able to understand the analytics in our visualizations, and deemed the tool `overall useful'. Participants also indicated a need for more detailed visualizations to include more demographic categories and contextual information of the characters. Participants' trust in AI-based gender and age models is seen to be moderate to low, although they were not against the use of AI in this context. Our tool including code, benchmarking, and the user study data can be found at https://github.com/debadyuti0510/Character-Representation-Media.
- Oceania > Australia > Australian Capital Territory > Canberra (0.05)
- Asia > China (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- (5 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.88)
- Research Report > New Finding (0.68)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Vision > Face Recognition (0.68)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.48)
CRENER: A Character Relation Enhanced Chinese NER Model
Chinese Named Entity Recognition (NER) is an important task in information extraction, which has a significant impact on downstream applications. Due to the lack of natural separators in Chinese, previous NER methods mostly relied on external dictionaries to enrich the semantic and boundary information of Chinese words. However, such methods may introduce noise that affects the accuracy of named entity recognition. To this end, we propose a character relation enhanced Chinese NER model (CRENER). This model defines four types of tags that reflect the relationships between characters, and proposes a fine-grained modeling of the relationships between characters based on three types of relationships: adjacency relations between characters, relations between characters and tags, and relations between tags, to more accurately identify entity boundaries and improve Chinese NER accuracy. Specifically, we transform the Chinese NER task into a character-character relationship classification task, ensuring the accuracy of entity boundary recognition through joint modeling of relation tags. To enhance the model's ability to understand contextual information, WRENER further constructed an adapted transformer encoder that combines unscaled direction-aware and distance-aware masked self-attention mechanisms. Moreover, a relationship representation enhancement module was constructed to model predefined relationship tags, effectively mining the relationship representations between characters and tags. Experiments conducted on four well-known Chinese NER benchmark datasets have shown that the proposed model outperforms state-of-the-art baselines. The ablation experiment also demonstrated the effectiveness of the proposed model.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- (14 more...)
Capturing Differences in Character Representations Between Communities: An Initial Study with Fandom
Sociolinguistic theories have highlighted how narratives are often retold, co-constructed and reconceptualized in collaborative settings. This working paper focuses on the re-interpretation of characters, an integral part of the narrative story-world, and attempts to study how this may be computationally compared between online communities. Using online fandom - a highly communal phenomenon that has been largely studied qualitatively - as data, computational methods were applied to explore shifts in character representations between two communities and the original text. Specifically, text from the Harry Potter novels, r/HarryPotter subreddit, and fanfiction on Archive of Our Own were analyzed for changes in character mentions, centrality measures from co-occurrence networks, and semantic associations. While fandom elevates secondary characters as found in past work, the two fan communities prioritize different subsets of characters. Word embedding tests reveal starkly different associations of the same characters between communities on the gendered concepts of femininity/masculinity, cruelty, and beauty. Furthermore, fanfiction descriptions of a male character analyzed between romance pairings scored higher for feminine-coded characteristics in male-male romance, matching past qualitative theorizing. The results high-light the potential for computational methods to assist in capturing the re-conceptualization of narrative elements across communities and in supporting qualitative research on fandom.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Nebraska (0.05)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (9 more...)
Evaluating Computational Representations of Character: An Austen Character Similarity Benchmark
Yang, Funing, Anderson, Carolyn Jane
Several systems have been developed to extract information about characters to aid computational analysis of English literature. We propose character similarity grouping as a holistic evaluation task for these pipelines. We present AustenAlike, a benchmark suite of character similarities in Jane Austen's novels. Our benchmark draws on three notions of character similarity: a structurally defined notion of similarity; a socially defined notion of similarity; and an expert defined set extracted from literary criticism. We use AustenAlike to evaluate character features extracted using two pipelines, BookNLP and FanfictionNLP. We build character representations from four kinds of features and compare them to the three AustenAlike benchmarks and to GPT-4 similarity rankings. We find that though computational representations capture some broad similarities based on shared social and narrative roles, the expert pairings in our third benchmark are challenging for all systems, highlighting the subtler aspects of similarity noted by human readers.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- (21 more...)
High Fidelity Scene Text Synthesis
Wang, Yibin, Zhang, Weizhong, Zheng, Jianwei, Jin, Cheng
Scene text synthesis involves rendering specified texts onto arbitrary images. Current methods typically formulate this task in an end-to-end manner but lack effective character-level guidance during training. Besides, their text encoders, pre-trained on a single font type, struggle to adapt to the diverse font styles encountered in practical applications. Consequently, these methods suffer from character distortion, repetition, and absence, particularly in polystylistic scenarios. To this end, this paper proposes DreamText for high-fidelity scene text synthesis. Our key idea is to reconstruct the diffusion training process, introducing more refined guidance tailored to this task, to expose and rectify the model's attention at the character level and strengthen its learning of text regions. This transformation poses a hybrid optimization challenge, involving both discrete and continuous variables. To effectively tackle this challenge, we employ a heuristic alternate optimization strategy. Meanwhile, we jointly train the text encoder and generator to comprehensively learn and utilize the diverse font present in the training dataset. This joint training is seamlessly integrated into the alternate optimization process, fostering a synergistic relationship between learning character embedding and re-estimating character attention. Specifically, in each step, we first encode potential character-generated position information from cross-attention maps into latent character masks. These masks are then utilized to update the representation of specific characters in the current step, which, in turn, enables the generator to correct the character's attention in the subsequent steps. Both qualitative and quantitative results demonstrate the superiority of our method to the state of the art.
Learning Mutually Informed Representations for Characters and Subwords
Wang, Yilin, Hu, Xinyi, Gormley, Matthew R.
Most pretrained language models rely on subword tokenization, which processes text as a sequence of subword tokens. However, different granularities of text, such as characters, subwords, and words, can contain different kinds of information. Previous studies have shown that incorporating multiple input granularities improves model generalization, yet very few of them outputs useful representations for each granularity. In this paper, we introduce the entanglement model, aiming to combine character and subword language models. Inspired by vision-language models, our model treats characters and subwords as separate modalities, and it generates mutually informed representations for both granularities as output. We evaluate our model on text classification, named entity recognition, POS-tagging, and character-level sequence labeling (intraword code-switching). Notably, the entanglement model outperforms its backbone language models, particularly in the presence of noisy texts and low-resource languages. Furthermore, the entanglement model even outperforms larger pre-trained models on all English sequence labeling tasks and classification tasks. We make our code publically available.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (9 more...)
Distinguishing Fictional Voices: a Study of Authorship Verification Models for Quotation Attribution
Michel, Gaspard, Epure, Elena V., Hennequin, Romain, Cerisara, Christophe
Recent approaches to automatically detect the speaker of an utterance of direct speech often disregard general information about characters in favor of local information found in the context, such as surrounding mentions of entities. In this work, we explore stylistic representations of characters built by encoding their quotes with off-the-shelf pretrained Authorship Verification models in a large corpus of English Figure 1: Example of quotation attribution on an excerpt novels (the Project Dialogism Novel Corpus). of Pride and Prejudice by Jane Austen (1813). Results suggest that the combination of stylistic Underlined text are identified mentions, and arrows link and topical information captured in some quotes to their relevant entity mention (solid arrows are of these models accurately distinguish characters explicit references and dashed arrows are anaphoric references).
- North America > Dominican Republic (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > China > Hong Kong (0.04)
- (11 more...)
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Huang, Jing, Wu, Zhengxuan, Mahowald, Kyle, Potts, Christopher
Language tasks involving character-level manipulations (e.g., spelling corrections, arithmetic operations, word games) are challenging for models operating on subword units. To address this, we develop a causal intervention framework to learn robust and interpretable character representations inside subword-based language models. Our method treats each character as a typed variable in a causal model and learns such causal structures by adapting the interchange intervention training method of Geiger et al. (2021). We additionally introduce a suite of character-level tasks that systematically vary in their dependence on meaning and sequence-level context. While character-level models still perform best on purely form-based tasks like string reversal, our method outperforms character-level models on more complex tasks that blend form, meaning, and context, such as spelling correction in context and word search games. Compared with standard subword-based models, our approach also significantly improves robustness on unseen token sequences and leads to human-interpretable internal representations of characters.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Germany > Berlin (0.04)
- North America > Dominican Republic (0.04)
- (8 more...)
- Transportation > Infrastructure & Services (0.61)
- Transportation > Ground > Road (0.61)
Multi-level Contrastive Learning for Script-based Character Understanding
Li, Dawei, Zhang, Hengyuan, Li, Yanran, Yang, Shiping
In this work, we tackle the scenario of understanding characters in scripts, which aims to learn the characters' personalities and identities from their utterances. We begin by analyzing several challenges in this scenario, and then propose a multi-level contrastive learning framework to capture characters' global information in a fine-grained manner. To validate the proposed framework, we conduct extensive experiments on three character understanding sub-tasks by comparing with strong pre-trained language models, including SpanBERT, Longformer, BigBird and ChatGPT-3.5. Experimental results demonstrate that our method improves the performances by a considerable margin. Through further in-depth analysis, we show the effectiveness of our method in addressing the challenges and provide more hints on the scenario of character understanding. We will open-source our work on github at https://github.com/David-Li0406/Script-based-Character-Understanding.
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (5 more...)
- Research Report > New Finding (0.66)
- Research Report > Experimental Study (0.46)
Beyond the Meta: Leveraging Game Design Parameters for Patch-Agnostic Esport Analytics
Chitayat, Alan Pedrassoli, Block, Florian, Walker, James, Drachen, Anders
Esport games comprise a sizeable fraction of the global games market, and is the fastest growing segment in games. This has given rise to the domain of esports analytics, which uses telemetry data from games to inform players, coaches, broadcasters and other stakeholders. Compared to traditional sports, esport titles change rapidly, in terms of mechanics as well as rules. Due to these frequent changes to the parameters of the game, esport analytics models can have a short life-spam, a problem which is largely ignored within the literature. As a case study, a neural network model is trained to predict the number of kills in a Dota 2 match utilising this novel character representation technique. The performance of this model is then evaluated against two distinct baselines, including conventional techniques. Not only did the model significantly outperform the baselines in terms of accuracy (85% AUC), but the model also maintains the accuracy in two newer iterations of the game that introduced one new character and a brand new character type. These changes introduced to the design of the game would typically break conventional techniques that are commonly used within the literature. Therefore, the proposed methodology for representing characters can increase the life-spam of machine learning models as well as contribute to a higher performance when compared to traditional techniques typically employed within the literature. Beyond the Meta: Leveraging Game Design Parameters for Patch-Agnostic Esport Analitics Introduction Esport titles, such as League of Legends and Dota 2, have amassed both large audiences and player-bases (Newzoo, 2022; Petrovskaya and Zendle, 2020). Due to the competitive nature of the genre, the player community often develop so called "metas" as explained by Kokkinakis et al. (2021). According to the author, metas are naturally discovered and developed strategies for optimum ways of playing the game that are focused in determining competitive advantage available within the current parameters of the game design.
- Europe > United Kingdom > England > North Yorkshire > York (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Asia > Russia > Ural Federal District > Sverdlovsk Oblast > Yekaterinburg (0.04)
- Asia > Indonesia > Sumatra > West Sumatra > Padang (0.04)
- Leisure & Entertainment > Sports (1.00)
- Leisure & Entertainment > Games > Computer Games (1.00)